Before we go any further, if you are in Google Colab, go to File -> Save a copy in Drive. This will create a personal copy you can freely edit. So feel free to take notes here and modify the code. If you don't save a copy in your Drive though, your changes will disappear when you close the session.
Sentinel 2¶
- go to wiki page on Sentinel 2 at Copernicus
- skim for 5 minutes and try to identify 3-5 key characteristics of Sentinel 2
- optionally, identify one question you think the text can answer
Sentinel 2¶
- Part of European Space Agency's (ESA) Copernicus programme
- Twin sattelites
- Sentinel 2A since 23. June 2015
- Sentinel 2B since 7. March 2017
- Sentinel 2C, launched on 5.9.2024 -- to replace 2A

Coverage and time resolution¶
- At same angle every 5 days (10 days for same sattelite)
- Between $56^{\circ}$ S and $84^{\circ}$ degrees N
- Coast up to 20 km from land
- Islands over $100\ \rm{km^2}$
- Antarctica on demand
- Sun synchronous orbit at 786 km, 10:30 am descending node

Image from Copernicus
Sentinel 2 multispectral sensor¶
- 13 spectral bands in VIS, VNIR, SWIR
- Spatial resolution 10, 20, 60 m
- Push-broom, swath of 290 km

Image from eoportal.org
Products¶
All free!
- Level-1B
- Top-of-atmosphere calibrated radiances in sensor geometry
- Include radiometric corrections
- Not orthorectified
- Come in granules
- Level-1C
- Top-of-atmosphere reflectance
- Radiometrically and geometrically corrected
- Orthorectified and spatially coregistered on a global reference system.
- Come in tiles
- Level-2A
- Bottom-of-atmosphere (Surface) reflectance
- Otherwise same as L-1C
Examples¶
Let's look at the Copernicus browser
Sentinel 2 - Searching, downloading, loading up and working with it in Python¶
!pip install rasterio matplotlib numpy pyproj shapely
Collecting rasterio Downloading rasterio-1.4.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.1 kB) Requirement already satisfied: matplotlib in /usr/local/lib/python3.12/dist-packages (3.10.0) Requirement already satisfied: numpy in /usr/local/lib/python3.12/dist-packages (2.0.2) Requirement already satisfied: pyproj in /usr/local/lib/python3.12/dist-packages (3.7.2) Requirement already satisfied: shapely in /usr/local/lib/python3.12/dist-packages (2.1.2) Collecting affine (from rasterio) Downloading affine-2.4.0-py3-none-any.whl.metadata (4.0 kB) Requirement already satisfied: attrs in /usr/local/lib/python3.12/dist-packages (from rasterio) (25.4.0) Requirement already satisfied: certifi in /usr/local/lib/python3.12/dist-packages (from rasterio) (2025.10.5) Requirement already satisfied: click>=4.0 in /usr/local/lib/python3.12/dist-packages (from rasterio) (8.3.0) Collecting cligj>=0.5 (from rasterio) Downloading cligj-0.7.2-py3-none-any.whl.metadata (5.0 kB) Collecting click-plugins (from rasterio) Downloading click_plugins-1.1.1.2-py2.py3-none-any.whl.metadata (6.5 kB) Requirement already satisfied: pyparsing in /usr/local/lib/python3.12/dist-packages (from rasterio) (3.2.5) Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (1.3.3) Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (4.60.1) Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (1.4.9) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (25.0) Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (11.3.0) Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.12/dist-packages (from matplotlib) (2.9.0.post0) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.12/dist-packages (from python-dateutil>=2.7->matplotlib) (1.17.0) Downloading rasterio-1.4.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.3/22.3 MB 45.1 MB/s eta 0:00:00 Downloading cligj-0.7.2-py3-none-any.whl (7.1 kB) Downloading affine-2.4.0-py3-none-any.whl (15 kB) Downloading click_plugins-1.1.1.2-py2.py3-none-any.whl (11 kB) Installing collected packages: cligj, click-plugins, affine, rasterio Successfully installed affine-2.4.0 click-plugins-1.1.1.2 cligj-0.7.2 rasterio-1.4.3
import numpy as np
import os
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm
The question¶
First half of 2022, there was a drought in Norway, so we could expect that the vegetation in 2022 wasn't as healthy as usual. This should show up by NDVI being generally lower in 2022 than in 2021.
The plan is to
- download one image from the same month each year
- calculate NDVI for each of them
- compare
Get the images¶
- go to Copernicus browser
- sign in (you have to be signed in to download)
- pick your area (I am picking my home :))
- draw the area of interest using the tools on the right side of the screen
- note down the corner coordinates of your area of interest (they show up on the bottom of the map when you hover over)
- use the Search tab in the left pane
- choose the right platform, image product and cloud coverage
- define time period as May 2021
- click on search
- choose an image that looks good
- download it
- repeat for 2022
Check the files¶
Where are the images? Unzip and explore the folder structure.
Adding the files to Colab¶
- open your Google Drive
- make a folder called "S2Intro-data"
- upload the two .SAFE folders into it
- mount your drive to google colab with the following command
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
Load the data¶
The folder structure of Sentinel 2 data is a little convoluted and the filenames are long. To avoid having to copy/paste the names, which is error prone, we load them by listing the contents of the folders.
data_dir = '/content/drive/MyDrive/S2Intro-data/'
may_2021_dir = [f.path for f in os.scandir(data_dir)
if '.SAFE' in f.name and '202105' in f.name][0]
may_2022_dir = [f.path for f in os.scandir(data_dir)
if '.SAFE' in f.name and '202205' in f.name][0]
print(may_2021_dir)
print(may_2022_dir)
/content/drive/MyDrive/S2Intro-data/S2B_MSIL2A_20210530T104619_N0500_R051_T32VPM_20230228T112003.SAFE /content/drive/MyDrive/S2Intro-data/S2A_MSIL2A_20220517T103631_N0400_R008_T32VPM_20220517T161111.SAFE
# Function to find files with specific strings in the filename
# Which strings do you think we need to find the different band files?
def find_file(root_folder, str_list):
"""
Find a file in a folder (and its subfolders) that contains
all strings in str_list in its filename.
Returns the full path to the file.
"""
for root, dirs, files in os.walk(root_folder):
for filename in files:
if all(x in filename for x in str_list):
return os.path.join(root, filename)
print(find_file(may_2021_dir, ['2021','B02', 'jp2']))
/content/drive/MyDrive/S2Intro-data/S2B_MSIL2A_20210530T104619_N0500_R051_T32VPM_20230228T112003.SAFE/GRANULE/L2A_T32VPM_A022099_20210530T104621/IMG_DATA/R20m/T32VPM_20210530T104619_B02_20m.jp2
import rasterio
# Open bands 2, 3, 4 and 8
# 2021
m2021_band2=rasterio.open(find_file(may_2021_dir, ['2021','B02', 'jp2']))
m2021_band3=rasterio.open(find_file(may_2021_dir, ['2021','B03', 'jp2']))
m2021_band4=rasterio.open(find_file(may_2021_dir, ['2021','B04', 'jp2']))
m2021_band8=rasterio.open(find_file(may_2021_dir, ['2021','B08', 'jp2']))
2022
m2022_band2=rasterio.open(find_file(may_2022_dir, ['2022','B02', 'jp2']))
m2022_band3=rasterio.open(find_file(may_2022_dir, ['2022','B03', 'jp2']))
m2022_band4=rasterio.open(find_file(may_2022_dir, ['2022','B04', 'jp2']))
m2022_band8=rasterio.open(find_file(may_2022_dir, ['2022','B08', 'jp2']))
# Check the properties of one of the bands
print(m2021_band2.profile)
# You can check the others too
{'driver': 'JP2OpenJPEG', 'dtype': 'uint16', 'nodata': None, 'width': 5490, 'height': 5490, 'count': 1, 'crs': CRS.from_wkt('PROJCS["WGS 84 / UTM zone 32N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",9],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32632"]]'), 'transform': Affine(20.0, 0.0, 600000.0,
0.0, -20.0, 6700020.0), 'blockxsize': 640, 'blockysize': 640, 'tiled': True}
Plotting the bands¶
# Fill in the blanks
fig = plt.figure(figsize=(20,6))
ax1 = fig.add_subplot(1,3,1)
ax1.imshow(m2022_band4.read(1), cmap='Reds')
ax2 = fig.add_subplot(1,3,2)
ax2.imshow(..., cmap='Greens')
ax3 = fig.add_subplot(1,3,3)
ax3.imshow(..., cmap='Blues')
##
fig = plt.figure(figsize=(20,6))
ax1 = fig.add_subplot(1,3,1)
ax1.imshow(m2022_band4.read(1), cmap='Reds')
ax1 = fig.add_subplot(1,3,2)
ax1.imshow(m2022_band3.read(1), cmap='Greens')
ax1 = fig.add_subplot(1,3,3)
ax1.imshow(m2022_band2.read(1), cmap='Blues')
<matplotlib.image.AxesImage at 0x789629b275f0>
Plotting the RGB¶
That comes with the granule
from rasterio.plot import show
RGB_thumbnail_file = find_file(may_2022_dir, ['.jpg'])
print(RGB_thumbnail_file)
RGB_thumbnail = rasterio.open(RGB_thumbnail_file)
# What do you get when you look at the properties?
/content/drive/MyDrive/S2Intro-data/S2A_MSIL2A_20220517T103631_N0400_R008_T32VPM_20220517T161111.SAFE/S2A_MSIL2A_20220517T103631_N0400_R008_T32VPM_20220517T161111-ql.jpg
/usr/local/lib/python3.12/dist-packages/rasterio/__init__.py:356: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned. dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
print(RGB_thumbnail.profile)
{'driver': 'JPEG', 'dtype': 'uint8', 'nodata': None, 'width': 343, 'height': 343, 'count': 3, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0,
0.0, 1.0, 0.0), 'blockxsize': 343, 'blockysize': 1, 'tiled': False, 'compress': 'jpeg', 'interleave': 'pixel', 'photometric': 'ycbcr'}
show(RGB_thumbnail)
<Axes: >
Limiting the area¶
Define the polygon¶
import shapely
import pyproj
from shapely.ops import transform
# bounding box coordinates of the area of interest in
# WGS84 CRS, [minx, miny, maxx, maxy]
bbox_coords = [60, 11.35, 60.21, 11.8] # find your for example
# Let's create a shapely geometry of the bounding box
polygon = shapely.geometry.box(*bbox_coords)
# what happens if you remove the asterisk?
# Get it into the right coordinate system
project = pyproj.Transformer.from_proj(
pyproj.Proj('epsg:4326'), # source coordinate system
pyproj.Proj('epsg:32632')) # destination coordinate system
# ! make sure you change to your destination coordinate system!
# apply projection
transformed_polygon = transform(project.transform, polygon)
Crop all the bands and save as one tif¶
import rasterio.mask
out_band2, out_transform = rasterio.mask.mask(m2021_band2,
[transformed_polygon],
crop=True)
out_meta = m2021_band2.meta
out_meta.update({"driver": "GTiff",
"count" : 4,
"height": out_band2.shape[1],
"width": out_band2.shape[2],
"transform": out_transform})
out_band3,_ = rasterio.mask.mask(m2021_band3, [transformed_polygon], crop=True)
out_band4,_ = rasterio.mask.mask(m2021_band4, [transformed_polygon], crop=True)
out_band8,_ = rasterio.mask.mask(m2021_band8, [transformed_polygon], crop=True)
with rasterio.open(data_dir+"/may2021_r10_masked.tif",
"w", **out_meta) as dest:
dest.write(out_band2.squeeze(),1)
dest.write(out_band3.squeeze(),2)
dest.write(out_band4.squeeze(),3)
dest.write(out_band8.squeeze(),4)
Do the same for 2022
Let's plot it as RGB¶
img1 = rasterio.open(data_dir+"/may2021_r10_masked.tif")
rgb_image1 = np.array([img1.read(3), img1.read(2), img1.read(1)])
# Values in optical bands come in digital units DN,
# where DN = 10 000 * reflectance where reflectance is 0-1
# Note, there is an offset of 1000 that is added to DNs
# to avoid negative numbers.
# https://docs.sentinel-hub.com/api/latest/data/sentinel-2-l2a/
# Using 3 to brighten the images, because while 10 000 is the theoretical
# max value that would be objects reflecting 100% of the incoming light
# Average albedo of the surface is 0.3, meaning you're only ever going to see
# ~30% of the sunlight reflected. Include scattering, absorption etc.
# and reasonable values are in the 20% range
print(np.unique(rgb_image1))
rgb_image1=(rgb_image1-1000.0)/10000*3
rgb_image1[rgb_image1>1]=1
rgb_image1[rgb_image1<0]=0
img2 = rasterio.open(data_dir+"/may2022_r10_masked.tif")
rgb_image2 = np.array([img2.read(3), img2.read(2), img2.read(1)])
rgb_image2=(rgb_image2-1000.0)/10000*3
rgb_image2[rgb_image2>1]=1
rgb_image2[rgb_image2<0]=0
# Use print statements to find out what kind of values the image comes as
[ 0 996 997 ... 6770 6944 7118]
# Plot the resulting images
fig, ax = plt.subplots(1,2, figsize=(20,12))
show(rgb_image1, ax=ax[0])
show(rgb_image2, ax=ax[1])
<Axes: >
NDVI images¶
# Fill in the gaps ...
# There is an offset of 1000 to the values
red_2021 = (img1.read(...) - ...).astype(int)
red_2021[red_2021<0] = 0
red_2022 = (img2.read(...) - ...).astype(int)
red_2022[red_2022<0] = 0
nir_2021 = (img1.read(...) - ...).astype(int)
nir_2021[nir_2021<0] = 0
nir_2022 = (img2.read(...) - ...).astype(int)
nir_2022[nir_2022<0] = 0
##
# There is an offset of 1000 to the values
red_2021 = (img1.read(3) - 1000.0).astype(int)
red_2021[red_2021<0] = 0
red_2022 = (img2.read(3) - 1000.0).astype(int)
red_2022[red_2022<0] = 0
nir_2021 = (img1.read(4) - 1000.0).astype(int)
nir_2021[nir_2021<0] = 0
nir_2022 = (img2.read(4) - 1000.0).astype(int)
nir_2022[nir_2022<0] = 0
print(np.unique(red_2022))
print(np.unique(red_2021))
print(np.unique(nir_2022))
print(np.unique(nir_2021))
[ 0 3 4 ... 6573 6792 7126] [ 0 1 2 ... 5076 5095 6118] [ 0 1 2 ... 7464 8376 9392] [ 0 1 2 ... 6988 7040 7072]
# Then we can calculate the NDVI
# Fill in the gaps
ndvi_2021 = (...)/(... + 1e-6)
ndvi_2022 = (...)/(... + 1e-6)
# Are you getting values that make sense?
##
# Then we can calculate the NDVI
ndvi_2021 = (nir_2021-red_2021)/(nir_2021+red_2021 + 1e-6)
ndvi_2022 = (nir_2022-red_2022)/(nir_2022+red_2022 + 1e-6)
print(np.unique(ndvi_2021))
print(np.unique(ndvi_2022))
[-1. -1. -1. ... 1. 1. 1.] [-1. -1. -1. ... 1. 1. 1.]
# Mean NDVI
print("Mean NDVI in 2021:", ndvi_2021.mean())
print("Mean NDVI in 2022:", ndvi_2022.mean())
Mean NDVI in 2021: 0.5772890831617853 Mean NDVI in 2022: 0.5236704960085511
# And let's plot the NDVI in grayscale
# Higher value gives darker color
fig, ax = plt.subplots(1,2, figsize=(12,5))
ax[0].imshow(ndvi_2021, cmap='Grays')
ax[1].imshow(ndvi_2022, cmap='Grays')
<matplotlib.image.AxesImage at 0x789604ae2210>
Let's look at the difference¶
We expect the health of the vegetation to be better in 2021, because of the drought in 2022. The NDVI should then be higher for 2021 than for 2022.
We should then expect the difference ndvi_2021 - ndvi_2022 to be mostly positive.
Let's check if that is the case.
# Plotting the NDVI difference
diff = ndvi_2021 - ndvi_2022
fig, ax = plt.subplots(1, figsize=(6,6))
pic = ax.imshow(diff, norm=matplotlib.colors.CenteredNorm(),
cmap = cm.coolwarm)
fig.colorbar(pic, ax=ax)
<matplotlib.colorbar.Colorbar at 0x789604c287a0>
diff_binary = diff
diff_binary[diff_binary>0]=1
diff_binary[diff_binary<0]=-1
fig, ax = plt.subplots(1, figsize=(6,6))
pic = ax.imshow(diff_binary, norm=matplotlib.colors.CenteredNorm(),
cmap = cm.coolwarm)
Save the NDVI difference as a Geotiff¶
# Open one of the orginal geotiffs, so we can copy the metadata from it
with rasterio.open(data_dir+'/may2021_r10_masked.tif') as src:
out_image = diff
out_meta = src.meta.copy()
# Update the metadata that are different for our new image
out_meta.update({
"driver": "GTiff",
"count": 1, #only one band instead of 12
"dtype": "float64", #float instead of int
})
# Use rasterio to write the numpy array into a geotiff
# using the defined metadata
with rasterio.open(data_dir+"/NDVI_diff.tiff", "w", **out_meta) as dest:
dest.write(diff, indexes=1)

